Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix block splitter minor MSAN warning. #2558

Merged
merged 2 commits into from
Mar 25, 2021

Conversation

senhuang42
Copy link
Contributor

@senhuang42 senhuang42 commented Mar 25, 2021

lastCountSize wasn't always initialized when it was used in superblocks or block splitting, causing an msan warning. This only showed up later in fuzzing because lastCountSize exists purely to deal with a random edge case from 1.3.4, so was unlikely to get checked.

test plan:

  • Check that we fail msan-fuzztest prior to this PR, and succeed after this PR.
  • Add msan fuzz test to the github CI

@@ -2382,7 +2383,7 @@ ZSTD_entropyCompressSeqStore_internal(seqStore_t* seqStorePtr,
BYTE* const ostart = (BYTE*)dst;
BYTE* const oend = ostart + dstCapacity;
BYTE* op = ostart;
size_t lastCountSize = 0;
Copy link
Contributor

@Cyan4973 Cyan4973 Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lastCountSize wasn't always initialized

OK, this one is not obvious judging from the content of the PR alone,
since the previous code was initializing variable lastCountSize correctly
(and does no longer now).

What you mean is that there are other places in the code
where ZSTD_buildSequencesStatistics() is invoked
and its parameter lastCountSize isn't initialized before invocation.

This effectively means that the parameter lastCountSize changes from being in+out
to being just out.
A pretty substantial change if you ask me, but one that's invisible from a function signature perspective.

Short of other more invasive technique (supplemental code annotation), I guess the only thing we can do is to properly document the behavior of parameters at interface declaration level.

Another idea would be to put lastCountSize as part of the return structure ZSTD_symbolEncodingTypeStats_t, clearly underlining that it's now a purely out return value, but it's a more invasive change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to keep it as in+out, we could set lastCountSize to 0 here and avoid doing so within the function:

ZSTD_symbolEncodingTypeStats_t stats;
, mirroring the old approach, but I think it makes more sense as a purely out parameter since an input value doesn't make any sense.

I think changing it to return as part of the ZSTD_symbolEncodingTypeStats_t makes sense, if we want to avoid having parameters that exist only as out types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no benefit at keeping it in+out.
Actually, making it purely out is a net gain from a code resiliency perspective.

And yes, making it part of the return structure seems like a natural follow up.

@senhuang42 senhuang42 merged commit 4fe2e7a into facebook:dev Mar 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants